A SIMPLE PROOF OF THE INVARIANT TORUS THEOREM 

JACQUES FEJOZ 



Abstract. We give a simple proof of Kolmogorov's theorem on the persistence of a
quasiperiodic invariant torus in Hamiltonian systems. The theorem is first reduced to
a well-posed inversion problem (Herman's normal form) by switching the frequency
obstruction from one side of the conjugacy to another. Then the proof consists in
applying a simple, well suited, inverse function theorem in the analytic category, which
itself relies on the Newton algorithm and on interpolation inequalities. A comparison
with other proofs is included in the appendix.

1. The invariant torus theorem

Let $\mathcal H$ be the space of germs along $\mathrm T^n_0 := \mathbb T^n \times \{0\}$ of real analytic Hamiltonians in
$\mathbb T^n \times \mathbb R^n = \{(\theta, r)\}$ ($\mathbb T^n = \mathbb R^n / 2\pi\mathbb Z^n$). The vector field associated with $H \in \mathcal H$ is

$$\vec H : \quad \dot\theta = \partial_r H, \quad \dot r = -\partial_\theta H.$$

For $\alpha \in \mathbb R^n$, let $\mathcal K$ be the affine subspace of Hamiltonians $K \in \mathcal H$ such that $K|_{\mathrm T^n_0}$ is
constant (i.e. $\mathrm T^n_0$ is invariant) and $\vec K|_{\mathrm T^n_0} = \alpha$. Those Hamiltonians are characterized by
their first order expansion along $\mathrm T^n_0$, of the form $c + \alpha \cdot r$ for some $c \in \mathbb R$, that is, their
expansion is constant with respect to $\theta$ and the coefficient of $r$ is $\alpha$.

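Let

$$D_{\gamma,\tau} = \{\alpha \in \mathbb R^n,\ \forall k \in \mathbb Z^n \setminus \{0\}\ \ |k \cdot \alpha| \geq \gamma |k|^{-\tau}\}, \qquad |k| := |k|_1 = |k_1| + \cdots + |k_n|.$$

If $\tau > n-1$, the set $\cup_{\gamma>0} D_{\gamma,\tau}$ has full measure ([Arnold, 1963, p. 83]). See appendix E.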
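The Diophantine condition can be probed numerically. The following is a minimal sketch, not part of the original argument: the function name `diophantine_constant`, the cutoff `max_norm` and the choice of the golden-mean frequency vector are assumptions made for illustration. A finite search can only give an upper bound on the best admissible $\gamma$; it cannot prove that $\alpha \in D_{\gamma,\tau}$.

```python
import itertools
import math


def diophantine_constant(alpha, tau, max_norm):
    """Return the minimum of |k . alpha| * |k|_1**tau over 0 < |k|_1 <= max_norm.

    This is only an upper bound on the largest gamma such that
    |k . alpha| >= gamma * |k|_1**(-tau) for all nonzero integer vectors k.
    """
    n = len(alpha)
    best = math.inf
    for k in itertools.product(range(-max_norm, max_norm + 1), repeat=n):
        norm1 = sum(abs(kj) for kj in k)
        if norm1 == 0 or norm1 > max_norm:
            continue
        small_divisor = abs(sum(kj * aj for kj, aj in zip(k, alpha)))
        best = min(best, small_divisor * norm1 ** tau)
    return best


# Hypothetical test frequency: (1, golden mean), with tau = n - 1 = 1.
golden = (1 + math.sqrt(5)) / 2
gamma_estimate = diophantine_constant((1.0, golden), tau=1.0, max_norm=60)

# A resonant vector: k = (1, -2) gives k . alpha = 0, so no gamma > 0 works.
gamma_resonant = diophantine_constant((1.0, 0.5), tau=1.0, max_norm=5)
```

For the golden-mean vector the small divisors stay bounded away from zero after rescaling by $|k|^\tau$, whereas a rational frequency ratio makes the estimate collapse to zero.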

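Theorem 1 ([Kolmogorov, 1954], [Chierchia, 2008]). Let $\alpha \in D_{\gamma,\tau}$ and $K^o \in \mathcal K$ such that
the averaged Hessian

$$\int_{\mathbb T^n} \frac{\partial^2 K^o}{\partial r^2}(\theta, 0)\, d\theta$$

is non degenerate. Every $H \in \mathcal H$ close to $K^o$ possesses an $\alpha$-quasiperiodic invariant
torus.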

This theorem has far reaching consequences. In particular it has led to a partial answer
to the long standing question of the stability of the Solar system ([Arnold, 1964], [Fejoz,
2004], [Celletti and Chierchia, 2007]). See [Bost, 1986], [Sevryuk, 2003], [de la Llave, 2001]
for references and background.

Kolmogorov's theorem is a consequence of the following normal form. Let $\mathcal G$ be the
space of germs along $\mathrm T^n_0$ of real analytic exact symplectomorphisms $G$ in $\mathbb T^n \times \mathbb R^n$ of the
following form:

$$G(\theta, r) = (\varphi(\theta),\ {}^t\varphi'(\theta)^{-1}(r + \rho(\theta))),$$

where $\varphi$ is a real analytic isomorphism of $\mathbb T^n$ fixing the origin, and $\rho$ is an exact 1-form
on $\mathbb T^n$.

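Theorem 2 (Herman). Let $\alpha \in D_{\gamma,\tau}$ and $K^o \in \mathcal K$. For every $H \in \mathcal H$ close enough to
$K^o$, there exists a unique $(K, G, \beta) \in \mathcal K \times \mathcal G \times \mathbb R^n$ close to $(K^o, \mathrm{id}, 0)$ such that

$$H = K \circ G + \beta \cdot r$$

in some neighborhood of $G^{-1}(\mathrm T^n_0)$. Moreover, $\beta$ depends $C^1$-smoothly on $H$.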

In other words, the orbits of Hamiltonians $K \in \mathcal K$ under the action of symplectomorphisms
of $\mathcal G$ locally form a subspace of finite codimension $n$. The offset $\beta \cdot r$ usually
breaks the dynamical conjugacy between $K$ and $H$; hence Herman's normal form is of
geometrical nature and can be called a twisted conjugacy. The strategy for deducing
the existence of an $H$-invariant torus (namely, $G^{-1}(\mathrm T^n_0)$) from that of a $K$-invariant
torus (namely, $\mathrm T^n_0$) is to show that $\beta$ vanishes on some subset of large measure in some
parameter space (in some cases, the frequency $\alpha$ cannot be fixed and needs to be varied).

In the paper, $O(r^n)$ will denote the ideal of functions of $(\theta, r)$ of the $n$-th order with
respect to $r$.

Proof of theorem 1 assuming theorem 2. Let $K^o_2(\theta) := \frac12 \frac{\partial^2 K^o}{\partial r^2}(\theta, 0)$. Let $F$ be the analytic
function taking values among symmetric bilinear forms, which solves the cohomological
equation $\mathcal L_\alpha F = K^o_2 - \int_{\mathbb T^n} K^o_2\, d\theta$ (see lemma 5), and $\varphi$ be the germ along $\mathrm T^n_0$ of
the (well defined) time-one map of the flow of the Hamiltonian $F(\theta) \cdot r^2$. The map $\varphi$ is
symplectic and restricts to the identity on $\mathrm T^n_0$. At the expense of substituting $K^o \circ \varphi$
and $H \circ \varphi$ for $K^o$ and $H$ respectively, one can thus assume that

$$K^o = c + \alpha \cdot r + Q \cdot r^2 + O(r^3), \qquad Q := \int_{\mathbb T^n} K^o_2(\theta)\, d\theta.$$

The germs so obtained from the initial $K^o$ and $H$ are close to one another.

Consider the family of trivial perturbations obtained by translating $K^o$ in the direction
of actions:

$$K^o_R(\theta, r) := K^o(\theta, R + r), \qquad R \in \mathbb R^n,\ R \text{ small},$$

and its approximation obtained by truncating the first order jet of $K^o_R$ along $\mathrm T^n_0$ from
its terms $O(R^2)$ which possibly depend on $\theta$:

$$\hat K^o_R(\theta, r) := (c + \alpha \cdot R) + (\alpha + 2Q \cdot R) \cdot r + O(r^2) = K^o_R + O(R^2).$$

For the Hamiltonian $\hat K^o_R$, $\mathrm T^n_0$ is invariant and quasiperiodic of frequency $\alpha + 2Q \cdot R$.
Hence the Herman normal form of $\hat K^o_R$ with respect to the frequency $\alpha$ is

$$\hat K^o_R = (\hat K^o_R - \hat\beta^o_R \cdot r) \circ \mathrm{id} + \hat\beta^o_R \cdot r, \qquad \hat\beta^o_R := 2Q \cdot R.$$

By assumption the matrix $\left.\frac{\partial \hat\beta^o_R}{\partial R}\right|_{R=0} = 2Q$ is invertible and the map $R \mapsto \hat\beta^o(R)$ is a local
diffeomorphism.

Now, theorem 2 asserts the existence of an analogous map $R \mapsto \beta(R)$ for $H_R$, which is
a small $C^1$-perturbation of $R \mapsto \hat\beta^o(R)$, and thus a local diffeomorphism, with a domain
having a lower bound locally uniform with respect to $H$. Hence if $H$ is close enough to
$K^o$ there is a unique small $R$ such that $\beta = 0$. For this $R$ the equality $H_R = K \circ G$
holds, hence the torus obtained by translating $G^{-1}(\mathrm T^n_0)$ by $R$ in the direction of actions
is invariant and $\alpha$-quasiperiodic for $H$. □

Exercise 3. Simplify this proof when $K^o = K^o(r)$ is integrable.

It is the aim of the rest of the paper to prove theorem 2 by locally inverting some
operator

$$\phi : (K, G, \beta) \mapsto H = K \circ G + \beta \cdot r$$

when $\alpha$ is diophantine.

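2. Complexification and the functional setting

For various sets $U$ and $V$, $\mathcal A(U, V)$ will denote the set of continuous maps $U \to V$ which
are real analytic on the interior $\mathring U$, and $\mathcal A(U) := \mathcal A(U, \mathbb C)$.

Recall notations for the abstract torus and its embedding in the phase space:

$$\mathbb T^n = \mathbb R^n / 2\pi\mathbb Z^n \quad\text{and}\quad \mathrm T^n_0 = \mathbb T^n \times \{0\} \subset \mathbb T^n \times \mathbb R^n.$$

Define complex extensions

$$\mathbb T^n_{\mathbb C} = \mathbb C^n / 2\pi\mathbb Z^n \quad\text{and}\quad \mathrm T^n_{\mathbb C} = \mathbb T^n_{\mathbb C} \times \mathbb C^n$$

as well as bases of neighborhoods

$$\mathbb T^n_s = \{\theta \in \mathbb T^n_{\mathbb C},\ \max_{1 \leq j \leq n} |\operatorname{Im}\theta_j| \leq s\} \quad\text{and}\quad \mathrm T^n_s = \{(\theta, r) \in \mathrm T^n_{\mathbb C},\ |(\theta, r)| \leq s\},$$

with $|(\theta, r)| := \max_{1 \leq j \leq n} \max(|\operatorname{Im}\theta_j|, |r_j|)$.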

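2.1. Spaces of Hamiltonians. - Let $\mathcal H_s = \mathcal A(\mathrm T^n_s)$, endowed with the Banach norm

$$|H|_s := \sup_{(\theta, r) \in \mathrm T^n_s} |H(\theta, r)|,$$

so that $\mathcal H$ is the inductive limit of the spaces $\mathcal H_s$.

- For $\alpha \in \mathbb R^n$, let $\mathcal K_s$ be the affine subspace consisting of those $K \in \mathcal H_s$ such that
$K(\theta, r) = c + \alpha \cdot r + O(r^2)$ for some $c \in \mathbb R$.

- If $G$ is a real analytic isomorphism on some open set of $\mathrm T^n_{\mathbb C}$ and if $G$ is transverse to
$\mathrm T^n_s$, let $G^*\mathcal A(\mathrm T^n_s) := \mathcal A(G^{-1}(\mathrm T^n_s))$ be endowed with the Banach norm

$$|H|_{G,s} := |H \circ G^{-1}|_s.$$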

2.2. Spaces of conjugacies. 

2.2.1. Diffeomorphisms of the torus. Let $\mathcal D_s$ be the space of maps $\varphi \in \mathcal A(\mathbb T^n_s, \mathbb T^n_{\mathbb C})$ which
are analytic isomorphisms from $\mathbb T^n_s$ to their image and which fix the origin.

Let also

$$\chi_s := \{v \in \mathcal A(\mathbb T^n_s)^n,\ v(0) = 0\}$$

be the space of vector fields on $\mathbb T^n_s$ which vanish at 0, endowed with the Banach norm

$$|v|_s := \max_{\theta \in \mathbb T^n_s} \max_{1 \leq j \leq n} |v_j(\theta)|.$$

According to corollary 15, the map

$$\sigma B^\chi_{s+\sigma} := \{v \in \chi_{s+\sigma},\ |v|_{s+\sigma} < \sigma\} \to \mathcal D_s, \qquad v \mapsto \mathrm{id} + v$$

is defined and locally bijective. It endows $\mathcal D_s$ with a local structure of Banach manifold
in the neighborhood of the identity.

We will consider the contragredient action of $\mathcal D_s$ on $\mathrm T^n_s$ (with values in $\mathrm T^n_{\mathbb C}$):

$$\varphi(\theta, r) := (\varphi(\theta),\ {}^t\varphi'(\theta)^{-1} \cdot r),$$

in order to linearize the dynamics on the alleged invariant tori.

2.2.2. Straightening tori. Let $\mathcal B_s$ be the space of exact one-forms over $\mathbb T^n_s$, with

$$|\rho|_s = \max_{\theta \in \mathbb T^n_s} \max_{1 \leq j \leq n} |\rho_j(\theta)|.$$

We will consider its action on $\mathrm T^n_s$ by translation of the actions:

$$\rho(\theta, r) := (\theta,\ r + \rho(\theta)),$$

in order to straighten the perturbed invariant tori.

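2.2.3. Our space of conjugacies. Let $\mathcal G_s = \mathcal D_s \times \mathcal B_s$, identified with a space of Hamiltonian
symplectomorphisms by

$$(\varphi, \rho)(\theta, r) := \varphi \circ \rho\,(\theta, r) = (\varphi(\theta),\ {}^t\varphi'(\theta)^{-1}(r + \rho(\theta))).$$

Endow its tangent space at the identity $T_{\mathrm{id}}\mathcal G_s = \mathfrak g_s := \chi_s \times \mathcal B_s$ with the norm

$$|\dot G|_s = |(v, \rho)|_s := \max(|v|_s, |\rho|_s),$$

and its tangent space at $G = (\varphi, \rho)$ with the norm

$$|\delta G|_s := |\delta G \circ G^{-1}|_s, \qquad \delta G \in T_G\mathcal G_s.$$

Here and elsewhere, the notation $\delta G$, as well as similar ones, should be taken as a whole;
there is no separate $\delta$ in the present paper.

Also consider the following neighborhoods of the identity:

$$\mathcal G^\sigma_s = \{G \in \mathcal G_s,\ \max_{(\theta, r) \in \mathrm T^n_s} |(\Theta - \theta, R - r)| \leq \sigma,\ (\Theta, R) = G(\theta, r)\}, \qquad \sigma > 0.$$

The operators (commuting with inclusions of source and target spaces)

$$\phi_s : E_s := \mathcal K_{s+\sigma} \times \mathcal G^\sigma_s \times \mathbb R^n \to \mathcal H_s, \qquad (K, G, \beta) \mapsto K \circ G + \beta \cdot r$$

are now defined.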

3. Local twisted conjugacy of Hamiltonians 

Theorem 4. Let $\alpha \in D_{\gamma,\tau}$. For all $0 < s < s + \sigma < 1$, $\phi_{s+\sigma}$ has a local inverse: if
$|H - K^o|_{s+\sigma}$ is small, there is a unique $(K, G, \beta) \in E_s$, $|\cdot|_s$-close to $(K^o, \mathrm{id}, 0)$ such that
$H = K \circ G + \beta \cdot r$. Moreover $\beta \circ \phi^{-1}$ is a $C^1$-function locally in the neighborhood of $K^o$
in $\mathcal H_{s+\sigma}$.

This entails theorem 2 and itself follows from the inverse function theorem of appendix A,
from lemma 11 (for the uniqueness) and from corollary 13 (for the smoothness of $\beta \circ \phi^{-1}$).
We will now check the two main hypotheses of appendix A (one on $\phi'^{-1}$ and one on $\phi''$).

Let $\mathcal L_\alpha$ be the Lie derivative operator in the direction of the constant vector field $\alpha$:

$$\mathcal L_\alpha f = f' \cdot \alpha = \sum_{1 \leq j \leq n} \alpha_j \frac{\partial f}{\partial \theta_j}.$$

We will need the following classical lemma in two instances in the proof of lemma 6.

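Lemma 5 (Cohomological equation). If $g \in \mathcal A(\mathbb T^n_{s+\sigma})$ has 0-average ($\int_{\mathbb T^n} g\, d\theta = 0$), there
exists a unique function $f \in \mathcal A(\mathbb T^n_s)$ of 0-average such that $\mathcal L_\alpha f = g$, and there exists a
$C_0 = C_0(n, \tau)$ such that, for any $\sigma$:

$$|f|_s \leq C_0\, \gamma^{-1} \sigma^{-\tau-n} |g|_{s+\sigma}.$$

Proof. Let $g(\theta) = \sum_{k \in \mathbb Z^n \setminus \{0\}} g_k\, e^{ik\cdot\theta}$ be the Fourier expansion of $g$. The unique formal
solution to the equation $\mathcal L_\alpha f = g$ is given by $f(\theta) = \sum_{k \in \mathbb Z^n \setminus \{0\}} \frac{g_k}{i\,k\cdot\alpha}\, e^{ik\cdot\theta}$.
Since $g$ is analytic, its Fourier coefficients decay exponentially: we find

$$|g_k| = \left| \int_{\mathbb T^n} g(\theta)\, e^{-ik\cdot\theta}\, \frac{d\theta}{(2\pi)^n} \right| \leq |g|_{s+\sigma}\, e^{-|k|(s+\sigma)}$$

by shifting the torus of integration to a torus $\operatorname{Im}\theta_j = \pm(s + \sigma)$.

Using this estimate and replacing the small denominators $k \cdot \alpha$ by the estimate defining
the diophantine property of $\alpha$, we get

$$|f|_s \leq \frac{|g|_{s+\sigma}}{\gamma} \sum_k |k|^\tau e^{-|k|\sigma} \leq \frac{2^n\, |g|_{s+\sigma}}{\gamma\,(n-1)!} \sum_{\ell \geq 1} (\ell + n - 1)^{\tau+n-1} e^{-\ell\sigma},$$

where the latter sum is bounded by

$$\int_1^\infty (\ell + n - 1)^{\tau+n-1} e^{-(\ell-1)\sigma}\, d\ell = \sigma^{-\tau-n} e^{n\sigma} \int_{n\sigma}^\infty \ell^{\tau+n-1} e^{-\ell}\, d\ell < \sigma^{-\tau-n} e^{n\sigma} \int_0^\infty \ell^{\tau+n-1} e^{-\ell}\, d\ell = \sigma^{-\tau-n} e^{n\sigma}\, \Gamma(\tau + n).$$

Hence $f$ belongs to $\mathcal A(\mathbb T^n_s)$ and satisfies the wanted estimate. □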
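The formal solution of the cohomological equation can be illustrated on a trigonometric polynomial. The sketch below is only an illustration under stated assumptions: the helper names `solve_cohomological`, `lie_derivative` and `evaluate`, the dictionary representation of Fourier coefficients and the sample data are all hypothetical, and the analytic norm estimates of the lemma are not checked.

```python
import cmath
import math


def solve_cohomological(g_coeffs, alpha):
    """Fourier coefficients of the zero-average f with L_alpha f = g.

    g_coeffs maps integer vectors k (tuples) to the coefficients g_k of a
    zero-average trigonometric polynomial g; f_k = g_k / (i k . alpha).
    """
    f_coeffs = {}
    for k, g_k in g_coeffs.items():
        if all(kj == 0 for kj in k):
            raise ValueError("g must have zero average")
        divisor = sum(kj * aj for kj, aj in zip(k, alpha))
        if divisor == 0:
            raise ZeroDivisionError("resonance: k . alpha = 0")
        f_coeffs[k] = g_k / (1j * divisor)
    return f_coeffs


def lie_derivative(coeffs, alpha):
    """Fourier coefficients of L_alpha f, namely i (k . alpha) f_k."""
    return {k: 1j * sum(kj * aj for kj, aj in zip(k, alpha)) * c
            for k, c in coeffs.items()}


def evaluate(coeffs, theta):
    """Evaluate the trigonometric polynomial with the given coefficients."""
    return sum(c * cmath.exp(1j * sum(kj * tj for kj, tj in zip(k, theta)))
               for k, c in coeffs.items())


# Hypothetical example: g = cos(theta_1 + theta_2), alpha = (1, golden mean).
alpha = (1.0, (1 + math.sqrt(5)) / 2)
g = {(1, 1): 0.5, (-1, -1): 0.5}
f = solve_cohomological(g, alpha)
```

In this example the solution is $f(\theta) = \sin(\theta_1 + \theta_2) / (\alpha_1 + \alpha_2)$, which makes the role of the small denominator $k \cdot \alpha$ explicit.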

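We will write $x = (K, G, \beta, c)$, $\delta x = (\delta K, \delta G, \delta\beta, \delta c)$ and $\hat\delta x = (\hat\delta K, \hat\delta G, \hat\delta\beta, \hat\delta c)$.
Fix $0 < s < s + \sigma < 1$.

Lemma 6. There exists $C' > 0$ which is locally uniform with respect to $x \in E_s$ in the
neighborhood of $G = \mathrm{id}$ such that the linear map $\phi'(x)$ has an inverse $\phi'(x)^{-1}$ satisfying

$$|\phi'(x)^{-1} \cdot \delta H|_s \leq \sigma^{-\tau-n-1} C'\, |\delta H|_{G,s+\sigma}.$$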

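Proof. A function $\delta H \in G^*\mathcal A(\mathrm T^n_{s+\sigma})$ being given, we want to solve the equation

$$\phi'(x) \cdot \delta x = \delta K \circ G + K' \circ G \cdot \delta G + \delta\beta \cdot r + \delta c = \delta H,$$

for the unknowns $\delta K \in T_K\mathcal K_s \subset \mathcal A(\mathrm T^n_s)$, $\delta G \in T_G\mathcal G_s$, $\delta\beta \in \mathbb R^n$ and $\delta c \in \mathbb R$, or, equivalently,
after composing with $G^{-1}$ to the right,

$$\delta K + K' \cdot \dot G + \delta\beta \cdot r \circ G^{-1} + \delta c = \dot H,$$

where we have set $\dot G := \delta G \circ G^{-1} \in \mathfrak g_s$ and $\dot H := \delta H \circ G^{-1} \in \mathcal A(\mathrm T^n_{s+\sigma})$.

More specifically, $G^{-1}$ and $\dot G$ are of the form

$$G^{-1}(\theta, r) = (\varphi^{-1}(\theta),\ {}^t\varphi' \circ \varphi^{-1}(\theta) \cdot r - \rho \circ \varphi^{-1}(\theta)), \qquad \dot G = (\dot\varphi,\ \dot\rho - r \cdot \dot\varphi'),$$

where $\dot\varphi \in \chi_{s+\sigma}$ and $\dot\rho \in \mathcal B_{s+\sigma}$, and we can expand

$$K = \alpha \cdot r + K_2(\theta) \cdot r^2 + O(r^3) \quad\text{and}\quad \dot H = \dot H_0(\theta) + \dot H_1(\theta) \cdot r + O(r^2).$$

The equation becomes

$$(1) \qquad [\dot\rho \cdot \alpha + \delta c - \rho \circ \varphi^{-1} \cdot \delta\beta] + r \cdot [-\dot\varphi' \cdot \alpha + \varphi' \circ \varphi^{-1} \cdot \delta\beta + 2K_2 \cdot \dot\rho] + \delta K = \dot H + O(r^2),$$

where the term $O(r^2)$ in the right hand side depends only on $K$ and $\dot G$, and not on
$\delta K$. The equation turns out to be triangular in the five unknowns. The existence and
uniqueness of a solution with the wanted estimate follows from repeated applications of
lemma 5 and Cauchy's inequality:

- The average over $\mathrm T^n_0$ of the first order terms with respect to $r$ in equation (1) yields

$$\delta\beta = \left( \int_{\mathbb T^n} \varphi' \circ \varphi^{-1}\, d\theta \right)^{-1} \cdot \int_{\mathbb T^n} \dot H_1\, d\theta,$$

which does exist if $\varphi$ is close to the identity (proposition 14).

- Similarly the average of the restriction to $\mathrm T^n_0$ of (1) yields:

$$\delta c = \int_{\mathbb T^n} \dot H_0\, d\theta + \int_{\mathbb T^n} \rho \circ \varphi^{-1}\, d\theta \cdot \delta\beta.$$

- Next, the restriction to $\mathrm T^n_0$ of (1) can be solved uniquely with respect to $\dot\rho$ according
to lemma 5 (applied with $\dot\rho = f'$).

- The part of degree one can then be solved for $\dot\varphi$ similarly.

- Terms of order $\geq 2$ in $r$ determine $\delta K$. □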

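Lemma 7. There exists a constant $C'' > 0$ which is locally uniform with respect to
$x \in E_{s+\sigma}$ in the neighborhood of $G = \mathrm{id}$ such that the bilinear map $\phi''(x)$ satisfies

$$|\phi''(x) \cdot \delta x \otimes \hat\delta x|_{G,s} \leq \sigma^{-1} C''\, |\delta x|_{s+\sigma}\, |\hat\delta x|_{s+\sigma}.$$

Proof. Differentiating $\phi$ twice yields

$$\phi''(x) \cdot \delta x \otimes \hat\delta x = \delta K' \circ G \cdot \hat\delta G + \hat\delta K' \circ G \cdot \delta G + K'' \circ G \cdot \delta G \otimes \hat\delta G,$$

whence the estimate. □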

A. An inverse function theorem

Let $E = (E_s)_{0<s<1}$ be a decreasing family of Banach spaces with increasing norms
and $\varepsilon B^E_s = \{x \in E_s,\ |x|_s < \varepsilon\}$, $\varepsilon > 0$, be its balls centered at 0.

Let $(F_s)$ be an analogous family. Endow $F$ with additional norms $|\cdot|_{x,s}$, $x \in E_s$, $0 < s <
1$, satisfying

$$|y|_{0,s} = |y|_s \quad\text{and}\quad |y|_{\hat x,s} \leq |y|_{x,\,s+|\hat x - x|_s}.$$

These norms allow for dealing with composition operators without artificially losing
some fixed "width of analyticity" $\sigma$ at each step of the Newton algorithm.

Let $\phi : \sigma B^E_{s+\sigma} \to F_s$, $s < s + \sigma$, $\phi(0) = 0$, be maps commuting with inclusions, twice
differentiable, such that the differential $\phi'(x) : E_{s+\sigma} \to F_s$ has a right inverse $\phi'(x)^{-1} :
F_{s+\sigma} \to E_s$, and

$$|\phi'(x)^{-1}\eta|_s \leq C' \sigma^{-\tau'} |\eta|_{x,s+\sigma}, \qquad |\phi''(x)\,\xi^{\otimes 2}|_{x,s} \leq C'' \sigma^{-\tau''} |\xi|^2_{s+\sigma},$$

with $C', C'', \tau', \tau'' \geq 1$. Let $C := C'C''$ and $\tau := \tau' + \tau''$.

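Theorem 8. $\phi$ is locally surjective and, more precisely, for any $s$, $\eta$ and $\sigma$ with $\eta < s$,

$$\varepsilon B^F_{s+\sigma} \subset \phi(\eta B^E_s), \qquad \varepsilon := 2^{-8\tau} C^{-2} \sigma^{2\tau} \eta.$$

In other words, $\phi$ has a right-inverse $\psi : \varepsilon B^F_{s+\sigma} \to \eta B^E_s$.

Proof. Some numbers $s$, $\eta$ and $\sigma$ and $y \in B^F_{s+\eta}$ being given, let

$$f : \sigma B^E_{s+\eta+\sigma} \to E_s, \qquad x \mapsto x + \phi'(x)^{-1}(y - \phi(x))$$

and

$$Q : \sigma B^E_{s+\sigma} \times \sigma B^E_{s+\sigma} \to F_s, \qquad (x, \hat x) \mapsto \phi(\hat x) - \phi(x) - \phi'(x)(\hat x - x).$$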



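Lemma 9. The function $Q$ satisfies: $|Q(x, \hat x)|_{x,s} \leq 2^{-1} C'' \sigma^{-\tau''} |\hat x - x|^2_{s+\sigma+|\hat x - x|_s}$.

Proof of the lemma. Let $x_t := (1 - t)x + t\hat x$. Taylor's formula yields

$$Q(x, \hat x) = \int_0^1 (1 - t)\, \phi''(x_t)(\hat x - x)^2\, dt,$$

hence

$$|Q(x, \hat x)|_{x,s} \leq \int_0^1 (1 - t)\, |\phi''(x_t)(\hat x - x)^2|_{x,s}\, dt \leq \int_0^1 (1 - t)\, |\phi''(x_t)(\hat x - x)^2|_{x_t,\,s+|x_t - x|_s}\, dt,$$

whence the estimate. □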

Now, let $s$, $\eta$ and $\sigma$ be fixed, with $\eta < s$ and $y \in \varepsilon B^F_{s+\sigma}$ for some $\varepsilon$. We will see that if $\varepsilon$
is small enough, the sequence $x_0 = 0$, $x_n := f^n(0)$ is defined for all $n \geq 0$ and converges
towards some preimage $x \in \eta B^E_s$ of $y$ by $\phi$.
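The iteration $x_n = f^n(0)$ is the Newton algorithm started at the origin. The following sketch runs it in the simplest possible setting, a single real variable with no loss of analyticity width; the map `phi`, the datum `y` and the function name `newton_right_inverse` are assumptions chosen for illustration and play no role in the proof.

```python
def newton_right_inverse(phi, dphi, y, steps=8):
    """Iterate x -> x + phi'(x)^{-1} (y - phi(x)) starting from x_0 = 0."""
    x = 0.0
    history = [x]
    for _ in range(steps):
        x = x + (y - phi(x)) / dphi(x)
        history.append(x)
    return x, history


# Hypothetical scalar model with phi(0) = 0.
def phi(x):
    return x + x * x


def dphi(x):
    return 1.0 + 2.0 * x


y = 0.1
x_star, history = newton_right_inverse(phi, dphi, y)

# Residuals |y - phi(x_n)|; for this quadratic phi, the residual at step n+1
# equals the Taylor remainder Q(x_n, x_{n+1}) = (x_{n+1} - x_n)**2.
residuals = [abs(y - phi(x)) for x in history]
```

In this toy model each residual is bounded by the square of the previous one, which is the quadratic convergence that the sequences $(\sigma_n)$ and $(s_n)$ are designed to preserve in the scale of Banach spaces.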

Let $(\sigma_n)_{n \geq 0}$ be a sequence of positive real numbers such that $3\sum \sigma_n = \sigma$, and $(s_n)_{n \geq 0}$
be the sequence decreasing from $s_0 := s + \sigma$ to $s$ defined by induction by the formula
$s_{n+1} = s_n - 3\sigma_n$.

Assuming the existence of $x_0, \ldots, x_{n+1}$, we see that $\phi(x_k) = y + Q(x_{k-1}, x_k)$, hence

$$x_{k+1} - x_k = \phi'(x_k)^{-1}(y - \phi(x_k)) = -\phi'(x_k)^{-1} Q(x_{k-1}, x_k) \qquad (1 \leq k \leq n).$$

Further assuming that $|x_{k+1} - x_k|_{s_k} \leq \sigma_k$, the estimate of the right inverse and lemma 9
entail that

$$|x_{n+1} - x_n|_{s_{n+1}} \leq c_n |x_n - x_{n-1}|^2_{s_n} \leq \cdots \leq c_n c_{n-1}^2 \cdots c_1^{2^{n-1}} |x_1|^{2^n}_{s_1}, \qquad c_k := 2^{-1} C \sigma_k^{-\tau}.$$

The estimate

$$|x_1|_{s_1} \leq C' (3\sigma_0)^{-\tau'} |y|_{s_0} \leq 2^{-1} C \sigma_0^{-\tau} \varepsilon = c_0\, \varepsilon$$

and the fact, to be checked later, that $c_k \geq 1$ for all $k \geq 0$, show:

$$|x_{n+1} - x_n|_{s_{n+1}} \leq \left( \varepsilon \prod_{k \geq 0} c_k^{2^{-k}} \right)^{2^n}.$$

Since $\sum_{n \geq 0} \rho^{2^n} \leq 2\rho$ if $2\rho \leq 1$, and using the definition of constants $c_k$'s, we get a
sufficient condition to have all $x_n$'s defined and to have $\sum |x_{n+1} - x_n|_s \leq \eta$:

$$(2) \qquad \varepsilon = \frac{\eta}{2} \prod_{k \geq 0} c_k^{-2^{-k}} = \frac{2\eta}{C^2} \prod_{k \geq 0} \sigma_k^{\tau 2^{-k}}.$$

Maximizing the upper bound of $\varepsilon$ under the constraint $3\sum_{n \geq 0} \sigma_n = \sigma$ yields $\sigma_k := \frac{\sigma}{6}\, 2^{-k}$.
A posteriori it is straightforward that $|x_{n+1} - x_n|_{s_n} \leq \sigma_n$ (as earlier assumed to apply
lemma 9) and $c_n \geq 1$ for all $n \geq 0$. Besides, using that $\sum k 2^{-k} = \sum 2^{-k} = 2$ we get

$$\frac{2\eta}{C^2} \prod_{k \geq 0} \sigma_k^{\tau 2^{-k}} = \frac{2\eta}{C^2} \left( \frac{\sigma}{6} \right)^{2\tau} \prod_{k \geq 0} 2^{-k \tau 2^{-k}} = \frac{2\eta}{C^2} \left( \frac{\sigma}{12} \right)^{2\tau} > \frac{\sigma^{2\tau} \eta}{2^{8\tau} C^2},$$

whence the theorem. □

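Exercise 10. The domain of $\psi$ contains $\varepsilon B^F_S$, $\varepsilon = 2^{-12\tau} \tau^{-1} C^{-2} S^{3\tau}$, for any $S$.

Proof. The above function $\varepsilon(\eta, \sigma) = 2^{-8\tau} C^{-2} \sigma^{2\tau} \eta$ attains its maximum with respect to
$\eta < s$ for $\eta = s$. Besides, under the constraint $s + \sigma = S$ the function $\varepsilon(s, \sigma)$ attains its
maximum when $\sigma = 2\tau s$ and $s = \frac{S}{1 + 2\tau}$. Hence, $S$ being fixed, the domain of $\psi$ contains
$\varepsilon B^F_S$ if

$$\varepsilon < \frac{2}{C^2}\, \frac{S}{1 + 2\tau} \left( \frac{2\tau S}{12(1 + 2\tau)} \right)^{2\tau}.$$

Given that $S < 1 < \tau$ by hypothesis, it suffices that $\varepsilon$ be equal to the stated value. □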

A.1. Regularity of the right-inverse. In the proof of theorem 8 we have built right
inverses $\psi : \varepsilon B^F_{s+\eta+\sigma} \to \eta B^E_{s+\eta}$ of $\phi$, commuting with inclusions. The estimate given in
the statement shows that $\psi$ is continuous at 0; due to the invariance of the hypotheses
of the theorem by small translations, $\psi$ is locally continuous.

We further make the following two assumptions:

- The maps $\phi'(x)^{-1} : F_{s+\sigma} \to E_s$ are left (as well as right) inverses (in theorem 2 we
have restricted to an adequate class of symplectomorphisms);

- The scale $(|\cdot|_s)$ of norms of $(E_s)$ satisfies some interpolation inequality:

$$|x|^2_{s+\sigma} \leq |x|_s\, |x|_{s+\tilde\sigma} \quad\text{for all } s, \sigma, \qquad \tilde\sigma = \sigma \left( 1 + \frac1s \right)$$

(according to the remark after corollary 16 this estimate is satisfied in the case of interest
to us, since $\sigma + \log(1 + \sigma/s) \leq \tilde\sigma$).

Lemma 11 (Lipschitz regularity). If $\sigma < s$ and $y, \hat y \in \varepsilon B^F_{s+\sigma}$ with $\varepsilon = 2^{-14\tau} C^{-3} \sigma^{3\tau}$,

$$|\psi(\hat y) - \psi(y)|_s \leq C_L\, |\hat y - y|_{s+\sigma}, \qquad C_L = 2C' \sigma^{-\tau'}.$$

In particular, $\psi$ is the unique local right inverse of $\phi$, and hence is also its local left
inverse.

Proof. Fix $\eta < \zeta < \sigma < s$; the impatient reader can readily look at the end of the proof
how to choose the auxiliary parameters $\eta$ and $\zeta$ more precisely.

Let $\varepsilon = 2^{-8\tau} C^{-2} \zeta^{2\tau} \eta$, and $y, \hat y \in \varepsilon B^F_{s+\sigma}$. According to theorem 8, $x := \psi(y)$ and
$\hat x := \psi(\hat y)$ are in $\eta B^E_{s+\sigma-\zeta}$, provided the condition, to be checked later, that $\eta < s + \sigma - \zeta$.
In particular, we will use a priori that

$$|\hat x - x|_{s+\sigma-\zeta} \leq |\hat x|_{s+\sigma-\zeta} + |x|_{s+\sigma-\zeta} \leq 2\eta.$$

We have

$$\hat x - x = \phi'(x)^{-1} \phi'(x)(\hat x - x) = \phi'(x)^{-1}(\hat y - y - Q(x, \hat x))$$

and, according to the assumed estimate on $\phi'(x)^{-1}$ and to lemma 9,

$$|\hat x - x|_s \leq C' \sigma^{-\tau'} |\hat y - y|_{s+\sigma} + 2^{-1} C \zeta^{-\tau} |\hat x - x|^2_{s+2\eta+|\hat x - x|_s}.$$

In the norm index of the last term, we will coarsely bound $|\hat x - x|_s$ by $2\eta$. Additionally
using the interpolation inequality:

$$|\hat x - x|^2_{s+4\eta} \leq |\hat x - x|_s\, |\hat x - x|_{s+\tilde\sigma}, \qquad \tilde\sigma = 4\eta \left( 1 + \frac1s \right)$$

yields

$$\left( 1 - 2^{-1} C \zeta^{-\tau} |\hat x - x|_{s+\tilde\sigma} \right) |\hat x - x|_s \leq C' \sigma^{-\tau'} |\hat y - y|_{s+\sigma}.$$

Now, we want to choose $\eta$ small enough so that

- first, $\tilde\sigma \leq \sigma - \zeta$, which implies $|\hat x - x|_{s+\tilde\sigma} \leq 2\eta$. By definition of $\tilde\sigma$, it suffices to have
$\eta \leq \frac{\sigma - \zeta}{4(1 + 1/s)}$.

- second, $2^{-1} C \zeta^{-\tau}\, 2\eta \leq 1/2$, or $\eta \leq \frac{\zeta^\tau}{2C}$, which implies that $2^{-1} C \zeta^{-\tau} |\hat x - x|_{s+\tilde\sigma} \leq 1/2$,
and hence $|\hat x - x|_s \leq 2C' \sigma^{-\tau'} |\hat y - y|_{s+\sigma}$.

A choice is $\zeta = \frac{\sigma}{2}$ and $\eta = \frac{\sigma^\tau}{2^{4\tau} C} < s$, whence the value of $\varepsilon$ in the statement. □

Proposition 12 (Smoothness). For every $\sigma < s$, there exist $\varepsilon$, $C_1$ such that for every
$y, \hat y \in \varepsilon B^F_{s+\sigma}$,

$$|\psi(\hat y) - \psi(y) - \phi'(\psi(y))^{-1}(\hat y - y)|_s \leq C_1\, |\hat y - y|^2_{s+\sigma}.$$

Moreover, the map $\psi' : \varepsilon B^F_{s+\sigma} \to L(F_{s+\sigma}, E_s)$ defined locally by $\psi'(y) = \phi'(\psi(y))^{-1}$ is
continuous.

Proof. Fix $\varepsilon$ as in the previous proof and $y, \hat y \in \varepsilon B^F_{s+\sigma}$. Let $x = \psi(y)$, $\eta = \hat y - y$,
$\xi = \psi(y + \eta) - \psi(y)$ (thus $\eta = \phi(x + \xi) - \phi(x)$), and $\Delta := \psi(y + \eta) - \psi(y) - \phi'(x)^{-1}\eta$.
Definitions yield

$$\Delta = \phi'(x)^{-1}(\phi'(x)\xi - \eta) = -\phi'(x)^{-1} Q(x, x + \xi).$$

Using the estimates on $\phi'(x)^{-1}$ and $Q$ and the latter lemma,

$$|\Delta|_s \leq C_1\, |\eta|^2_{s+\sigma'}$$

for some $\sigma'$ tending to 0 when $\sigma$ itself tends to 0, and for some $C_1 > 0$ depending on $\sigma$.
Up to the substitution of $\sigma$ by $\sigma'$, the estimate is proved.

The inversion of linear operators between Banach spaces being analytic, $y \mapsto \phi'(\psi(y))^{-1}$
is continuous in the stated sense. □

Corollary 13. If $\pi \in L(E_s, V)$ is a family of linear maps, commuting with inclusions,
into a fixed Banach space $V$, then $\pi \circ \psi$ is $C^1$ and $(\pi \circ \psi)' = \pi \cdot (\phi' \circ \psi)^{-1}$.

This corollary is used with $\pi : (K, G, \beta) \mapsto \beta$ in the proof of theorem 4.

B. Some estimates on analytic isomorphisms 

In this appendix, we give a quantitative inverse function theorem for real analytic
isomorphisms on $\mathbb T^n$. This is used in section 2 to parametrize locally $\mathcal D_s$ by vector fields,
and, in lemma 6, to solve the cohomological equation for the frequency offset $\delta\beta$.

Recall that we have set $\mathbb T^n_s := \{\theta \in \mathbb C^n / 2\pi\mathbb Z^n,\ \max_{1 \leq j \leq n} |\operatorname{Im}\theta_j| \leq s\}$. We will denote
by $p : \mathbb R^n_s := \mathbb R^n \times i[-s, s]^n \to \mathbb T^n_s$ its universal covering.

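Proposition 14. Let $v \in \mathcal A(\mathbb T^n_{s+2\sigma}, \mathbb C^n)$, $|v|_{s+2\sigma} < \sigma$. The map $\mathrm{id} + v : \mathbb T^n_{s+2\sigma} \to \mathbb R^n_{s+3\sigma}$
induces a map $\varphi : \mathbb T^n_{s+2\sigma} \to \mathbb T^n_{s+3\sigma}$ whose restriction $\varphi : \mathbb T^n_{s+\sigma} \to \mathbb T^n_{s+2\sigma}$ has a unique
right inverse $\psi : \mathbb T^n_s \to \mathbb T^n_{s+\sigma}$:

$$\mathbb T^n_s \xrightarrow{\ \psi\ } \mathbb T^n_{s+\sigma} \xrightarrow{\ \varphi\ } \mathbb T^n_{s+2\sigma}.$$

Furthermore,

$$|\psi - \mathrm{id}|_s \leq |v|_{s+\sigma}$$

and, provided $2\sigma^{-1} |v|_{s+2\sigma} \leq 1$,

$$|\psi' - \mathrm{id}|_s \leq 2\sigma^{-1} |v|_{s+2\sigma}.$$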

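Proof. Let $\Phi : \mathbb R^n_{s+2\sigma} \to \mathbb R^n_{s+3\sigma}$ be a continuous lift of $\mathrm{id} + v$ and $k \in M_n(\mathbb Z)$, $k(l) :=
\Phi(x + l) - \Phi(x)$.

(1) Injectivity of $\Phi : \mathbb R^n_{s+\sigma} \to \mathbb R^n_{s+2\sigma}$. Suppose that $x, \hat x \in \mathbb R^n_{s+\sigma}$ and $\Phi(x) = \Phi(\hat x)$.
By the mean value theorem,

$$|x - \hat x| = |v(p\hat x) - v(px)| \leq |v'|_{s+\sigma}\, |x - \hat x|,$$

and, by Cauchy's inequality,

$$|x - \hat x| \leq \frac{|v|_{s+2\sigma}}{\sigma}\, |x - \hat x| < |x - \hat x|,$$

hence $x = \hat x$.

(2) Surjectivity of $\Phi$: $\mathbb R^n_s \subset \Phi(\mathbb R^n_{s+\sigma})$. For any given $y \in \mathbb R^n_s$, the contraction

$$f : \mathbb R^n_{s+\sigma} \to \mathbb R^n_{s+\sigma}, \qquad x \mapsto y - v(px)$$

has a unique fixed point, which is a pre-image of $y$ by $\Phi$.

(3) Injectivity of $\varphi : \mathbb T^n_{s+\sigma} \to \mathbb T^n_{s+2\sigma}$. Suppose that $px, p\hat x \in \mathbb T^n_{s+\sigma}$ and $\varphi(px) =
\varphi(p\hat x)$, i.e. $\Phi(x) = \Phi(\hat x) + \kappa$ for some $\kappa \in \mathbb Z^n$. That $k$ be in $GL(n, \mathbb Z)$ follows from
the invertibility of $\Phi$. Hence, $\Phi(x - k^{-1}(\kappa)) = \Phi(\hat x)$, and, due to the injectivity
of $\Phi$, $px = p\hat x$.

(4) Surjectivity of $\varphi$: $\mathbb T^n_s \subset \varphi(\mathbb T^n_{s+\sigma})$. This is a trivial consequence of that of $\Phi$.

(5) Estimate on $\psi := \varphi^{-1} : \mathbb T^n_s \to \mathbb T^n_{s+\sigma}$. Note that the wanted estimate on $\psi$ is in
the sense of $\Psi := \Phi^{-1} : \mathbb R^n_s \to \mathbb R^n_{s+\sigma}$. If $y \in \mathbb R^n_s$,

$$\Psi(y) - y = -v(p\Psi(y)),$$

hence $|\psi - \mathrm{id}|_s \leq |v|_{s+\sigma}$.

(6) Estimate on $\psi'$. We have $\psi' = \varphi'^{-1} \circ \psi$, where $\varphi'^{-1}(x)$ stands for the inverse of
the map $\xi \mapsto \varphi'(x) \cdot \xi$. Hence

$$\psi' - \mathrm{id} = \varphi'^{-1} \circ \psi - \mathrm{id},$$

and, under the assumption that $2\sigma^{-1} |v|_{s+2\sigma} \leq 1$,

$$|\psi' - \mathrm{id}|_s \leq |\varphi'^{-1} - \mathrm{id}|_{s+\sigma} \leq \frac{|v'|_{s+\sigma}}{1 - |v'|_{s+\sigma}} \leq \frac{\sigma^{-1} |v|_{s+2\sigma}}{1 - \sigma^{-1} |v|_{s+2\sigma}} \leq 2\sigma^{-1} |v|_{s+2\sigma}.$$

□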
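Step (2) of the proof is constructive: the inverse of $\mathrm{id} + v$ is the fixed point of the contraction $x \mapsto y - v(x)$. The sketch below illustrates this on the real line only, with a hypothetical perturbation `v` and a hypothetical function name `invert_near_identity`; it does not track the complex strips $\mathbb R^n_s$ of the proposition.

```python
import math


def invert_near_identity(v, y, iterations=60):
    """Solve x + v(x) = y by iterating the contraction x -> y - v(x)."""
    x = y
    for _ in range(iterations):
        x = y - v(x)
    return x


# Hypothetical small 2*pi-periodic perturbation, with |v| <= 0.1 and |v'| <= 0.1.
def v(x):
    return 0.1 * math.sin(x)


y = 1.0
x = invert_near_identity(v, y)
```

The iterates converge geometrically with ratio at most $\sup |v'|$, and the limit satisfies $|x - y| \leq \sup |v|$, which mirrors the estimate $|\psi - \mathrm{id}|_s \leq |v|_{s+\sigma}$.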

C. Interpolation of spaces of analytic functions 

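In this section we prove some Hadamard interpolation inequalities, which are used in A.1.

Recall that we denote by $\mathbb T^n_{\mathbb C}$ the infinite annulus $\mathbb C^n / 2\pi\mathbb Z^n$, by $\mathbb T^n_s$, $s > 0$, the bounded
sub-annulus $\{\theta \in \mathbb T^n_{\mathbb C},\ |\operatorname{Im}\theta_j| \leq s,\ j = 1 \ldots n\}$ and by $\mathbb D^n_t$, $t > 0$, the polydisc $\{r \in
\mathbb C^n,\ |r_j| \leq t,\ j = 1 \ldots n\}$. The supremum norm of a function $f \in \mathcal A(\mathbb T^n_s \times \mathbb D^n_t)$ will be
denoted by $|f|_{s,t}$.

Let $0 < s_0 \leq s_1$ and $0 < t_0 \leq t_1$ be such that

$$\log \frac{t_1}{t_0} = s_1 - s_0.$$

Let also $0 \leq \rho \leq 1$ and

$$s = (1 - \rho)s_0 + \rho s_1 \quad\text{and}\quad t = t_0^{1-\rho}\, t_1^{\rho}.$$

Proposition 15. If $f \in \mathcal A(\mathbb T^n_{s_1} \times \mathbb D^n_{t_1})$,

$$|f|_{s,t} \leq |f|^{1-\rho}_{s_0,t_0}\, |f|^{\rho}_{s_1,t_1}.$$

Proof. Let $\tilde f$ be the function on $\mathbb T^n_{s_1} \times \mathbb D^n_{t_1}$, constant on $2n$-tori of equations $(\operatorname{Im}\theta, |r|) = \text{cst}$,
defined by

$$\tilde f(\theta, r) = \max_{\mu, \nu \in \mathbb T^n} \left| f\left( (\pm\theta_1 + \mu_1, \ldots, \pm\theta_n + \mu_n),\ (r_1 e^{i\nu_1}, \ldots, r_n e^{i\nu_n}) \right) \right|$$

(with all possible combinations of signs). Since $\log |f|$ is subharmonic and $\mathbb T^{2n}$ is compact,
$\log \tilde f$ too is upper semi-continuous. Besides, $\log \tilde f$ satisfies the mean inequality, hence is
plurisubharmonic.

By the maximum principle, the restriction of $|f|$ to $\mathbb T^n_s \times \mathbb D^n_t$ attains its maximum on the
distinguished boundary of $\mathbb T^n_s \times \mathbb D^n_t$. Due to the symmetry of $\tilde f$:

$$|f|_{s,t} = \tilde f(is\epsilon, t\epsilon), \qquad \epsilon = (1, \ldots, 1).$$

Now, the function

$$\varphi(z) := \tilde f(z\epsilon,\ e^{-(iz+s)}\, t\, \epsilon)$$

is well defined on $\mathbb T_{s_1}$, for it is constant with respect to $\operatorname{Re} z$ and, due to the relations
imposed on the norm indices, if $|\operatorname{Im} z| \leq s_1$ then $|e^{-(iz+s)}\, t| \leq e^{s_1 - s}\, t = t_1$.

The estimate

$$\log \varphi(z) \leq \frac{s_1 - \operatorname{Im} z}{s_1 - s_0} \log \varphi(is_0) + \frac{\operatorname{Im} z - s_0}{s_1 - s_0} \log \varphi(is_1)$$

trivially holds if $\operatorname{Im} z = s_0$ or $s_1$, for, as noted above for $j = 1$, $e^{s_j - s}\, t = t_j$, $j = 0, 1$.

But note that the left and right hand sides respectively are subharmonic and harmonic.

Hence the estimate holds whenever $s_0 \leq \operatorname{Im} z \leq s_1$, whence the claim for $z = is$. □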

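Recall that we have let $\mathrm T^n_s := \mathbb T^n_s \times \mathbb D^n_s$, $s > 0$, and, for a function $f \in \mathcal A(\mathrm T^n_s)$, let
$|f|_s = |f|_{s,s}$ denote its supremum norm on $\mathrm T^n_s$. As in the rest of the paper, we now
restrict the discussion to widths of analyticity $\leq 1$.

Corollary 16. If $\sigma_1 = -\log\left(1 - \frac{\sigma_0}{s}\right)$ and $f \in \mathcal A(\mathrm T^n_{s+\sigma_1})$,

$$|f|^2_s \leq |f|_{s-\sigma_0}\, |f|_{s+\sigma_1}.$$

In A.1 we will use the equivalent fact that, if $\tilde\sigma = \sigma + \log\left(1 + \frac{\sigma}{s}\right)$ and $f \in \mathcal A(\mathrm T^n_{s+\tilde\sigma})$,

$$|f|^2_{s+\sigma} \leq |f|_s\, |f|_{s+\tilde\sigma}.$$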

Proof. In proposition 15, consider the following particular case:

• $\rho = 1/2$. Hence

$$s = \frac{s_0 + s_1}{2} \quad\text{and}\quad t = \sqrt{t_0 t_1}.$$

• $s = t$. Hence in particular $t_0 = s\, e^{s_0 - s}$ and $t_1 = s\, e^{s_1 - s}$.

Then

$$|f|^2_s = |f|^2_{s,s} \leq |f|_{s_0,t_0}\, |f|_{s_1,t_1}.$$

We want to determine $\max(s_0, t_0)$ and $\max(s_1, t_1)$. Let $\sigma_1 := s - s_0 = s_1 - s$. Then
$t_0 = s\, e^{-\sigma_1}$ and $t_1 = s\, e^{\sigma_1}$. The expression $s + \sigma - s e^{\sigma}$ has the sign of $\sigma$ (in the relevant
region $0 \leq s + \sigma \leq 1$, $0 \leq s \leq 1$); by evaluating it at $\sigma = \pm\sigma_1$, we see that $s_0 \leq t_0$ and
$s_1 \geq t_1$.

Therefore, since the norm $|\cdot|_{s,t}$ is non-decreasing with respect to both $s$ and $t$,

$$|f|^2_s \leq |f|_{t_0,t_0}\, |f|_{s_1,s_1} = |f|_{t_0}\, |f|_{s_1}$$

(thus giving up estimates uniform with respect to small values of $s$). By further setting
$\sigma_0 = s - t_0 = s(1 - e^{-\sigma_1})$, we get the wanted estimate, and the asserted relation between
$\sigma_0$ and $\sigma_1$ is readily verified. □

D. Weaker arithmetic conditions of convergence 

In this section, we look more carefully at the arithmetic conditions needed for the
induction to converge, in the proof of the inverse function theorem 8.

A function $\Delta : \mathbb N^* \to [1, +\infty[$ being given, define the set $D_\Delta$ as the subset of vectors
$\alpha \in \mathbb R^n$ such that

Ifc-al ^ 1 ^"^"' 1 (VfceP\{0}). 

(The function $\Delta$ is just some other normalization of what is an approximation function
in [Rüssmann, 1975] or a zone function in [Dumas et al., 2004].) For $D_\Delta$ to be non empty
trivially we need $\lim_{+\infty} \Delta = +\infty$.

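Proposition 17. The conclusions of theorems 1 and 2 hold if there exist $c > 0$ and
$\delta \in\, ]0, 1[$ such that

$$\sum_{\ell \geq 1} \Delta(\ell)\, e^{-\ell/j^2} \leq \exp\left(c\, 2^{\delta j}\right) \quad\text{as } j \to +\infty.$$

Example 18. The Diophantine set $D_{\gamma,\tau}$ corresponds to a polynomially growing function
$\Delta$, and to a polynomially growing function $\sum_{\ell \geq 1} \Delta(\ell)\, e^{-\ell/j^2}$. A fortiori, $\sum_{\ell \geq 1} \Delta(\ell)\, e^{-\ell/j^2}$
is at most polynomially growing.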

Proof. Call $L$ the discrete Laplace transform of $\Delta$:

$$L(\sigma) = \sum_{\ell \geq 1} \Delta(\ell)\, e^{-\ell\sigma},$$

and assume it is finite for all $\sigma > 0$. Patterning the proof of lemma 5, we get the following
generalization.

Lemma 19. Let $g \in \mathcal A(\mathbb T^n_{s+\sigma})$ have 0-average. There is a unique function $f \in \mathcal A(\mathbb T^n_s)$
of zero average such that $\mathcal L_\alpha f = g$. This function satisfies

$$|f|_s \leq C\, L(\sigma)\, |g|_{s+\sigma}, \qquad C = \frac{2^n e}{(n-1)!}.$$

(Again, see [Rüssmann, 1975] for improved estimates. But such an improvement is not
the crux of our purpose here.)

Taking up the proof of the inverse function theorem of appendix A with our new estimates
(see in particular equation (2)), we see that the Newton algorithm converges provided

$$\sum_{j \geq 0} 2^{-j} \log L(\sigma_j) < \infty,$$

for some choice of the converging series $\sum \sigma_j$. Choosing $\sum \sigma_j = \sum j^{-2}$, we see that it is
enough that $\log L(\sigma_j) \leq c\, 2^{\delta j}$ for some $c > 0$ and $\delta \in\, ]0, 1[$, whence the given criterion. □

E. Comments 

Section 1. The proof of Kolmogorov's theorem presented here differs from others chiefly
for the following reasons:

- The seeming detour through Herman's normal form reduces Kolmogorov's theorem
to a functionally well posed inversion problem (compare with [Zehnder, 1975, 1976]).
This powerful trick consists in switching the frequency obstruction (obstruction to the
conjugacy to the initial dynamics) from one side of the conjugacy to the other. It was
extensively used in [Moser, 1967]. The remaining, finite dimensional problem is then to
show that the frequency offset $\beta \in \mathbb R^n$ may vanish; in general, it is met using a
non-degeneracy hypothesis of one kind or another. Looking backward, this last step is not
the most difficult, but was probably not well understood before M. Herman in the 80s
(see [Rüssmann, 1990] and [Sevryuk, 1999]). The functional setting chosen here adapts
to more degenerate cases, including lower dimensional tori, in a straightforward manner
(see [Fejoz, 2004]; compare to Herman's preferred proof for Lagrangian tori, as exposed
in [Bost, 1986]).



- Classical perturbation series (or some modification of these) have been shown to
converge in some cases ([Siegel, 1942] for the convergence of Schröder series in the Siegel
problem, see [Eliasson, 1996] for Lindstedt series of Hamiltonians). Direct methods for
proving their convergence are involved because, as J. Moser noticed in [Moser, 1967,
p. 149], these series do not converge absolutely, and thus the proof of semi-convergence
must take into account compensations or the precise accumulation of small denominators
through a subtle combinatorial analysis. On the other hand, the perturbation series
yielded by the Newton algorithm are absolutely convergent, provided that one adequately
chooses the width of analytic spaces at each step of the induction. This was a major
discovery of Kolmogorov. In the first approximation, the series so obtained can be thought
of as obtained by grouping terms of the classical perturbation series (from step $j$ to step
$j + 1$, the non resonant terms of size $\varepsilon^{2^j}, \ldots, \varepsilon^{2^{j+1}-1}$ are eliminated). The magic is that
compensations are taken into account without noticing, and it would be interesting to
understand how classical and Newton series relate precisely, maybe with mould calculus.

- We encapsulate the Newton algorithm in an abstract inverse function theorem à la
Nash-Moser. The algorithm indeed converges without any specific hypothesis on the
internal structure of the variables. At the expense of some optimality, ignoring this
structure allows for simple estimates (and control of the bounds) and for solving a whole
class of analogous problems with the same toolbox (lower dimensional tori, codimension-
one tori, Siegel problem, as well as some problems in singularity theory).

- The analytic (or Gevrey) category is simpler, in Nash-Moser theory, than Hölder or
Sobolev categories because the Newton algorithm can be carried out without intercalating
smoothing operators (cf. [Sergeraert, 1972], [Bost, 1986]).

- Incidentally, Hadamard interpolation inequalities are simple to infer for analytic norms
because, again, they do not depend on regularizing operators, as is shown in appendix C
(cf. [Hörmander, 1976, Theorem A.5]).

- The use of auxiliary norms ($|\cdot|_{G,s}$ in lemmas 6 and 7 and $|\cdot|_{x,s}$ in appendix A) prevents
from artificially losing, due to compositions, a fixed width of analyticity at each step of
the Newton algorithm, the domains of analyticity being deformed rather than shrunk.
As a pitfall, the argument of [Jacobowitz, 1972, Sections 5 and 6] to deduce an inverse
function theorem in the smooth category abstractly from the theorem in the analytic
category, does not apply directly here (see comment below).



Section 1. Theorem 2. Herman's normal form is the Hamiltonian analogue of the normal
form of vector fields on the torus in the neighborhood of Diophantine constant vector
fields ([Arnold, 1961], [Moser, 1966a]). The normal form for Hamiltonians implies the
normal form for vector fields on the torus [Fejoz, 2004, Théorème 40] and is actually
simpler to prove from the algebraic point of view.



Section 3. Lemma. The estimate is obtained by bounding the terms of Fourier series 
one by one. In a more careful estimate, one should take into account the fact that if 
$|k \cdot \alpha|$ is small, then $k' \cdot \alpha$ is not so small for neighboring $k'$s. 
This allows one to find the optimal exponent of $\sigma$, making it independent of the 
dimension; see Moser [1966b] and Rüssmann [1975]. 
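
As a numerical illustration of the term-by-term treatment of Fourier series, here is a 
minimal Python sketch (NumPy assumed) which solves an equation of cohomological type, 
$\alpha \cdot \partial_\theta f = g$ on the 2-torus, by dividing each Fourier coefficient 
of $g$ by the small denominator $i\, k \cdot \alpha$. The function name 
`solve_cohomological`, the grid size and the frequency vector (1, golden mean) are 
choices made for this example only; they are not taken from the text. 

```python
import numpy as np

def solve_cohomological(g, alpha):
    """Return the zero-average solution f of alpha . grad(f) = g on the 2-torus.

    g is a real (n, n) array of samples on a uniform grid of [0, 2*pi)^2,
    assumed to have zero average; alpha is a length-2 frequency vector.
    """
    n = g.shape[0]
    k = np.fft.fftfreq(n, d=1.0 / n)               # integer wave numbers
    k1, k2 = np.meshgrid(k, k, indexing="ij")
    small_div = k1 * alpha[0] + k2 * alpha[1]      # k . alpha
    g_hat = np.fft.fft2(g)
    f_hat = np.zeros_like(g_hat)
    nonzero = small_div != 0                       # leaves out the mode k = 0
    f_hat[nonzero] = g_hat[nonzero] / (1j * small_div[nonzero])
    return np.real(np.fft.ifft2(f_hat))


# Illustrative (hypothetical) data: one harmonic, frequency vector (1, golden mean).
alpha = np.array([1.0, (1.0 + np.sqrt(5.0)) / 2.0])
n = 32
theta = 2.0 * np.pi * np.arange(n) / n
t1, t2 = np.meshgrid(theta, theta, indexing="ij")

g = np.cos(t1 - 2.0 * t2)
f = solve_cohomological(g, alpha)

# For a single harmonic k = (1, -2) the solution is explicit.
f_exact = np.sin(t1 - 2.0 * t2) / (alpha[0] - 2.0 * alpha[1])
max_error = np.max(np.abs(f - f_exact))
```

Each coefficient is divided independently of the others, which is what bounding the 
terms one by one amounts to; the sketch does not attempt the finer estimate exploiting 
the fact that neighboring denominators cannot be simultaneously small. 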



Appendix A. Theorem. - The two competing small parameters $\eta$ and $\sigma$ being fixed, 
our choice of the sequence $(\sigma_n)$ maximizes $\epsilon$ for the Newton algorithm. It 
does not modify the sequence $(x_k)$ but only the information we retain from $(x_k)$. 

- In the expression of $\epsilon$, the square exponent of $C$ is inherent in the quadratic 
convergence of Newton's algorithm. From this follows the dependence, in KAM theory, of the 
size $\epsilon$ of the allowed perturbation with respect to the small Diophantine constant 
$\gamma$: $\epsilon = O(\gamma^2)$. 

- The method of Jacobowitz [1972] (see also Moser [1966b]) for deducing an inverse 
function theorem in the smooth category from its analogue in the analytic category does 
not work directly here. The idea would be to use Jackson's theorem in approximation 
theory to characterize the Hölder spaces by their approximation properties in terms of 
analytic functions and then to find a smooth preimage $x$ of a smooth function $y$ 
as the limit of analytic preimages $x_j$ of analytic approximations $y_j$ of $y$. However, 
in our inverse function theorem we require the operator $\phi$ to be defined only on balls 
$\sigma B_{s+\sigma}$ with shrinking radii when $s + \sigma$ tends to 0. This domain is too 
small in general to include all the analytic approximations $y_j$ of a smooth $y$. Such a 
restriction is inherent in the presence of composition operators. Jacobowitz [1972] did not 
have to deal with such operators for the problem of isometric embeddings. Yet we could 
generalize Jacobowitz's proof at the expense of making additional hypotheses on the form 
of our operator $\phi$, which would take into account the specificity of directions $K$ 
and $G$, as well as of the real phase space and of its complex extension. 
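
The quadratic convergence of Newton's algorithm mentioned in these comments can be 
observed on a toy scalar example. The following Python sketch is only an illustration 
on a hypothetical equation ($x^2 = 2$), far from the functional setting of the theorem; 
the helper name `newton_errors` is introduced for this example. It records the errors 
$e_n = |x_n - x_\infty|$ and the ratios $e_{n+1}/e_n^2$, which remain bounded. 

```python
import math

def newton_errors(f, df, x0, root, steps):
    """Run Newton's iteration x -> x - f(x)/df(x) and record |x_n - root|."""
    x = x0
    errors = [abs(x - root)]
    for _ in range(steps):
        x = x - f(x) / df(x)
        errors.append(abs(x - root))
    return errors


errors = newton_errors(lambda x: x * x - 2.0, lambda x: 2.0 * x,
                       x0=1.5, root=math.sqrt(2.0), steps=4)

# Ratios e_{n+1} / e_n^2 for the first three steps; later errors fall below
# machine precision, so they are not used.
ratios = [errors[i + 1] / errors[i] ** 2 for i in range(3)]
```

For this particular equation one has exactly $e_{n+1} = e_n^2 / (2 x_n)$, so the ratios 
stay close to $1/(2\sqrt{2})$: the error is essentially squared at each step, up to a 
fixed constant. 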

Appendix A.1. It is possible to prove that $\psi$ is $C^1$ without additional assumptions, 
just by following [Sergeraert, 1972, p. 626]. Yet the proof simplifies and the estimates 
improve under the two combined additional assumptions. In particular, the existence of 
a right inverse of $\phi'(x)$ makes the inverse $\psi$ unique and thus allows one to ignore 
the way it was built. 

Appendix B. We include this elementary section for the sake of completeness, although 
the quantitative estimates are needed only if one wants a quantitative version of 
Kolmogorov's theorem, with an explicit value of $\epsilon$. A similar proposition (for 
germs at a point of maps in $\mathbb{C}^n$) is proved in Pöschel [2001] using a more 
sophisticated argument from degree theory. 

Appendix C. In this paragraph, the obtained inequalities generalize the standard Hadamard 
convexity inequalities. They are optimal and show that analytic norms are not 
quite convex with respect to the width of the complex extensions, due to the geometry 
of the phase space. See [Narasimhan, 1995, Chap. 8] for more general but less precise 
inequalities. 
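
For comparison, the standard Hadamard inequality in one variable (the three-lines 
theorem) says that the supremum norm $N(s)$ of an analytic function on the strip 
$|\mathrm{Im}\, z| \le s$ is logarithmically convex in $s$. The Python sketch below 
(NumPy assumed) checks this numerically on a hypothetical trigonometric polynomial; 
the name `strip_norm`, the coefficients and the widths are illustrative assumptions, 
and the sketch does not reproduce the inequalities of the appendix itself. 

```python
import math
import numpy as np

def strip_norm(coeffs, s, samples=2048):
    """Sup norm of f(z) = sum_k c_k exp(i k z) on the strip |Im z| <= s.

    coeffs maps each integer k to its coefficient c_k. By the maximum
    principle the supremum is attained on one of the lines Im z = s, -s;
    it is approximated here by sampling these two lines.
    """
    x = 2.0 * np.pi * np.arange(samples) / samples
    best = 0.0
    for y in (s, -s):
        z = x + 1j * y
        values = sum(c * np.exp(1j * k * z) for k, c in coeffs.items())
        best = max(best, float(np.max(np.abs(values))))
    return best


coeffs = {1: 1.0, -2: 1.0}            # f(z) = exp(i z) + exp(-2 i z)
n_small = strip_norm(coeffs, 0.1)
n_mid = strip_norm(coeffs, 0.3)       # 0.3 is the midpoint of 0.1 and 0.5
n_large = strip_norm(coeffs, 0.5)

# Log-convexity at the midpoint: N(0.3) <= sqrt(N(0.1) * N(0.5)).
is_log_convex = n_mid <= math.sqrt(n_small * n_large)
```

For this polynomial the norm is explicit, $N(s) = e^{2s} + e^{-s}$, which gives an 
independent check of the sampled values. 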

Appendix E. Proposition 11. There are reasons to believe that the arithmetic condition 
so obtained is not optimal. Indeed, solving the exact cohomological equation at each 
step is inefficient because the small denominators appearing with intermediate-order 
harmonics deteriorate the estimates, whereas some of these harmonics could have a 
smaller amplitude than the error terms and thus would better not be taken care of. 
Even more strikingly, Rüssmann and Pöschel recently noticed that at each 
step it is worth neglecting part of the low-order harmonics themselves (to some carefully 
chosen extent). Then the expense, a worse error term, turns out to be smaller than 
the gain: namely, the right-hand side of the cohomological equation now has a smaller 
size over a larger complex extension. This makes it possible, with a slowly converging 
sequence of approximations, to show the persistence of invariant tori under some 
arithmetic condition which, in one dimension, is equivalent to the Brjuno condition; see 
Pöschel [2009]. 
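
In its usual formulation for an irrational number with convergents $p_n/q_n$, the Brjuno 
condition requires the series $\sum_n \log(q_{n+1})/q_n$ to converge. The Python sketch 
below computes partial sums of this series from a list of partial quotients; the function 
name `brjuno_partial_sums` and the choice of the golden mean as test case are assumptions 
made for illustration. 

```python
import math

def brjuno_partial_sums(partial_quotients):
    """Partial sums of sum_n log(q_{n+1}) / q_n, where the q_n are the
    denominators of the convergents of [a_0; a_1, a_2, ...]."""
    q_prev, q = 0, 1                       # q_{-1} = 0, q_0 = 1
    total = 0.0
    sums = []
    for a in partial_quotients[1:]:        # a_0 does not enter the q_n
        q_next = a * q + q_prev            # q_{n+1} = a_{n+1} q_n + q_{n-1}
        total += math.log(q_next) / q
        sums.append(total)
        q_prev, q = q, q_next
    return sums


# Golden mean [1; 1, 1, ...]: the q_n are Fibonacci numbers.
sums = brjuno_partial_sums([1] * 60)
tail_variation = abs(sums[-1] - sums[-10])
```

For the golden mean the denominators grow geometrically, so the partial sums stabilize 
very quickly; numbers whose partial quotients grow extremely fast make the series diverge. 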



Thank you to P. Bernard, A. Chenciner, R. Krikorian, I. Kupka, D. Sauzin and J.-C. 
Yoccoz, for illuminating discussions, and to A. Albouy and A. Knauf for careful reading 
and correcting. 